Smart Paper Notes

The Robustness of Tether Friction in Non-idealized Terrains

Justin J. Page, Laura K. Treers, Steven Jens Jorgensen, Ronald S. Fearing, Hannah S. Stuart

Category: Robotics

2022-08-22

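Reduced traction limits the ability of mobile robotic systems to resist or apply large external loads, such as when tugging a payload. A simple and versatile solution is to wrap a tether around naturally occurring objects to exploit the capstan effect and amplify holding force exponentially. Experiments show that the idealized capstan model explains the force amplification experienced on common irregular outdoor objects (trees, rocks, posts). Robust to variable environmental conditions, this exponential amplification method can harness single or multiple capstan objects, either in series or in parallel with a team of robots. This adaptability allows a range of potential configurations and is especially useful when objects cannot be fully encircled or gripped. These principles are demonstrated with mobile platforms that (1) control the lowering and arrest of a payload, (2) achieve planar control of a payload, and (3) act as an anchor point for a larger platform. We show that a simple tether, wrapped around shallow stones in sand, amplifies the holding force of a low-traction platform by up to 774x.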
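
The exponential amplification mentioned in this abstract follows the ideal capstan equation, T_load / T_hold = exp(mu * theta), where mu is the tether-object friction coefficient and theta is the total wrap angle in radians. The sketch below is only an illustration of that textbook relation: the friction coefficient is an assumed value, not one reported by the paper, and the function names are hypothetical.

```python
import math

def capstan_gain(mu: float, wrap_angle_rad: float) -> float:
    """Ideal capstan equation: ratio of load tension to holding tension."""
    return math.exp(mu * wrap_angle_rad)

def wrap_angle_for_gain(mu: float, gain: float) -> float:
    """Wrap angle (radians) needed to reach a target amplification."""
    return math.log(gain) / mu

def series_gain(wraps) -> float:
    """Capstans in series multiply their gains (the exponents add).

    `wraps` is an iterable of (mu, wrap_angle_rad) pairs, one per object.
    """
    return math.exp(sum(mu * theta for mu, theta in wraps))

if __name__ == "__main__":
    mu = 0.5  # ASSUMED friction coefficient, for illustration only
    for turns in (0.5, 1.0, 2.0):
        theta = 2.0 * math.pi * turns
        print(f"{turns} turns -> gain {capstan_gain(mu, theta):.1f}")
    theta_774 = wrap_angle_for_gain(mu, 774.0)
    print(f"774x at mu={mu} needs {theta_774 / (2.0 * math.pi):.2f} turns")
```

Because the gain depends on mu and theta only through their product, a lower friction coefficient can in principle be compensated by additional wraps or by chaining several objects in series.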

Translated by Google Translate

Virtual Axle Detector based on Analysis of Bridge Acceleration Measurements by Fully Convolutional Network

Steven Robert Lorenzen, Henrik Riedel, Maximilian Michael Rupp, Leon Schmeiser, Hagen Berthold, Andrei Firus, Jens Schneider

Category: Computer Vision

2022-07-08

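In the practical application of Bridge Weigh-In-Motion (BWIM) methods, the position of the wheels or axles during the passage of a vehicle is in most cases a prerequisite. To avoid the use of conventional axle detectors and bridge-type-specific methods, we propose a novel method for axle detection through the placement of accelerometers at any point of a bridge. In order to develop a model that is as simple and comprehensible as possible, the axle detection task is implemented as a binary classification problem instead of a regression problem. The model is implemented as a fully convolutional network that processes signals in the form of continuous wavelet transforms. This allows passages of any length to be processed in a single step with maximum efficiency, while using multiple scales in a single evaluation. It also enables our method to use acceleration signals at any location of the bridge structure, serving as Virtual Axle Detectors (VADs), without being limited to specific structural types of bridges. To test the proposed method, we analyzed 3787 train passages recorded on a steel trough railway bridge of a long-distance traffic line. Our results on the measurement data show that our model detects 95% of the axles; that is, 128,599 of 134,800 previously unseen axles were correctly detected. In total, 90% of the axles can be detected with a maximum spatial error of 20 cm, at a maximum velocity of $v_{\mathrm{max}} = 56.3~\mathrm{m/s}$. The analysis shows that our developed model can use accelerometers as VADs even under real operating conditions.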
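
As a quick check of the figures quoted in this abstract, the arithmetic below recomputes the detection rate from the axle counts and converts the 20 cm spatial tolerance into a timing tolerance at the stated maximum speed. The conversion is an interpretation added here for illustration (it assumes the spatial error bound is evaluated at v_max); the variable names are hypothetical.

```python
# Figures taken from the abstract.
detected_axles = 128_599
total_axles = 134_800
v_max_m_per_s = 56.3
max_spatial_error_m = 0.20

# Fraction of previously unseen axles that were detected.
detection_rate = detected_axles / total_axles

# ASSUMPTION: a train moving at v_max covers the spatial tolerance in this
# much time, so this is the implied timing precision at the highest speed.
timing_tolerance_s = max_spatial_error_m / v_max_m_per_s

if __name__ == "__main__":
    print(f"detection rate: {detection_rate:.1%}")
    print(f"timing tolerance at v_max: {timing_tolerance_s * 1e3:.2f} ms")
```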

A Tutorial on Parametric Variational Inference

Jens Sjölund

Category: Machine Learning (Statistics) | Machine Learning

2023-01-03

Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.
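
The optimization view described in this abstract is usually stated through the evidence lower bound (ELBO). The identity below is the standard textbook form, not a formula quoted from the tutorial; the notation (observations x, latent variables z, variational family q_phi with parameters phi) is assumed here.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q_\phi(z)}\!\left[\log p(x, z) - \log q_\phi(z)\right]}_{\mathrm{ELBO}(\phi)}
  + \mathrm{KL}\!\left(q_\phi(z) \,\|\, p(z \mid x)\right)
  \;\ge\; \mathrm{ELBO}(\phi)
```

Since the KL term is non-negative and log p(x) does not depend on phi, maximizing the ELBO over phi both tightens the bound on the marginal likelihood and moves q_phi toward the posterior.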

MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Steven H. Wang, Antoine Scardigli, Leonard Tang, Wei Chen, Dimitry Levkin, Anya Chen, Spencer Ball, Thomas Woodside, Oliver Zhang, Dan Hendrycks

Category: Natural Language Processing

2023-01-02

Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.

ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports

Katharina Jeblick, Balthasar Schachtner, Jakob Dexl, Andreas Mittermeier, Anna Theresa Stüber, Johanna Topalis, Tobias Weber, Philipp Wesp, Bastian Sabel, Jens Ricke

Category: Natural Language Processing | Machine Learning

2022-12-30

The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.

Current State of Community-Driven Radiological AI Deployment in Medical Imaging

Vikash Gupta, Barbaros Selnur Erdal, Carolina Ramirez, Ralf Floca, Laurence Jackson, Brad Genereaux, Sidney Bryson, Christopher P Bridge, Jens Kleesiek, Felix Nensa

Category: Artificial Intelligence

2022-12-29

Artificial Intelligence (AI) has become commonplace for solving routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed the MONAI Consortium, an open-source community that is building standards for AI deployment in healthcare institutions and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem-solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes that take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical radiology workflow. We also present a taxonomy of radiology AI use cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.

Statistical Distance Based Deterministic Offspring Selection in SMC Methods

Oskar Kviman, Hazal Koptagel, Harald Melin, Jens Lagergren

Category: Machine Learning (Statistics) | Machine Learning

2022-12-23

Over the years, sequential Monte Carlo (SMC) and, equivalently, particle filter (PF) theory has gained substantial attention from researchers. However, the performance of the resampling methodology, also known as offspring selection, has not advanced recently. We propose two deterministic offspring selection methods, which strive to minimize the Kullback-Leibler (KL) divergence and the total variation (TV) distance, respectively, between the particle distributions before and after offspring selection. By reducing the statistical distance between the selected offspring and the joint distribution, we obtain a heuristic search procedure that outperforms a maximum likelihood search in precisely those contexts where the latter performs better than an SMC. For SMC and particle Markov chain Monte Carlo (pMCMC), our proposed offspring selection methods always outperform or compare favorably with the two state-of-the-art resampling schemes on two models commonly used as benchmarks in the literature.
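
To make the two statistical distances named in this abstract concrete, the sketch below evaluates them for a toy particle set. It only computes the distances for one given offspring allocation; it does not implement the paper's selection procedures. The weights, the offspring counts, and the direction of the KL divergence (post-selection relative to pre-selection) are all assumptions chosen for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0.

    Requires q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# HYPOTHETICAL example: normalized particle weights before selection ...
weights = [0.5, 0.3, 0.15, 0.05]
# ... and an assumed offspring allocation (how many copies of each particle).
offspring_counts = [2, 1, 1, 0]
n_particles = sum(offspring_counts)
post_selection = [c / n_particles for c in offspring_counts]

if __name__ == "__main__":
    print("KL(post || pre):", kl_divergence(post_selection, weights))
    print("TV(post, pre):  ", total_variation(post_selection, weights))
```

A deterministic selection rule in this spirit would search over integer allocations summing to the particle count and keep the one with the smallest distance, instead of drawing offspring counts at random.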

Multilingual News Location Detection using an Entity-Based Siamese Network with Semi-Supervised Contrastive Learning and Knowledge Base

Víctor Suárez-Paniagua, Steven Derby, Tri Kurniawan Wijaya

Category: Natural Language Processing | Artificial Intelligence

2022-12-22

Early detection of relevant locations in a piece of news is especially important in extreme events such as environmental disasters, war conflicts, disease outbreaks, or political turmoil. Additionally, this detection helps recommender systems promote relevant news based on user locations. Note that, when the relevant locations are not mentioned explicitly in the text, state-of-the-art methods typically fail to recognize them because these methods rely on syntactic recognition. In contrast, by incorporating a knowledge base and connecting entities with their locations, our system successfully infers the relevant locations even when they are not mentioned explicitly in the text. To evaluate the effectiveness of our approach, and due to the lack of datasets in this area, we also contribute to the research community a gold-standard multilingual news-location dataset, NewsLOC. It contains the annotation of the relevant locations (and their WikiData IDs) of 600+ Wikinews articles in five different languages: English, French, German, Italian, and Spanish. Through experimental evaluations, we show that our proposed system outperforms the baselines and the fine-tuned version of the model using semi-supervised data that increases the classification rate. The source code and the NewsLOC dataset are publicly available to the research community at https://github.com/vsuarezpaniagua/NewsLocation.

Deep set conditioned latent representations for action recognition

Akash Singh, Tom De Schepper, Kevin Mets, Peter Hellinckx, Jose Oramas, Steven Latre

Category: Computer Vision

2022-12-21

In recent years, multi-label, multi-class video action recognition has gained significant popularity. While reasoning over temporally connected atomic actions is mundane for intelligent species, standard artificial neural networks (ANNs) still struggle to classify them. In the real world, atomic actions often temporally connect to form more complex composite actions. The challenge lies in recognising composite actions of varying durations while other distinct composite or atomic actions occur in the background. Drawing upon the success of relational networks, we propose methods that learn to reason over the semantic concept of objects and actions. We empirically show how ANNs benefit from pretraining, relational inductive biases, and unordered set-based latent representations. In this paper we propose deep set conditioned I3D (SCI3D), a two-stream relational network that employs a latent representation of state and a visual representation for reasoning over events and actions. The network learns to reason about temporally connected actions in order to identify all of them in the video. The proposed method achieves an improvement of around 1.49% mAP in atomic action recognition and 17.57% mAP in composite action recognition over an I3D-NL baseline on the CATER dataset.

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models

Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven C. H. Hoi

Category: Computer Vision

2022-12-21

Large language models (LLMs) have demonstrated excellent zero-shot generalization to new language tasks. However, effective utilization of LLMs for zero-shot visual question answering (VQA) remains challenging, primarily due to the modality disconnection and task disconnection between the LLM and the VQA task. End-to-end training on vision and language data may bridge the disconnections, but is inflexible and computationally expensive. To address this issue, we propose Img2Prompt, a plug-and-play module that provides prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training. In order to provide such prompts, we further employ LLM-agnostic models to provide prompts that can describe image content and self-constructed question-answer pairs, which can effectively guide the LLM to perform zero-shot VQA tasks. Img2Prompt offers the following benefits: 1) It can flexibly work with various LLMs to perform VQA. 2) Without the need for end-to-end training, it significantly reduces the cost of deploying LLMs for zero-shot VQA tasks. 3) It achieves comparable or better performance than methods relying on end-to-end training. For example, we outperform Flamingo by 5.6% on VQAv2. On the challenging A-OKVQA dataset, our method even outperforms few-shot methods by as much as 20%.
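
The idea of turning image content and self-constructed question-answer pairs into a text prompt for a frozen LLM can be sketched as plain string assembly. This is a minimal illustration only: the prompt template, the function name, and the example captions and QA pairs are all hypothetical, and the caption and QA generation models the abstract refers to are not implemented here.

```python
def build_vqa_prompt(captions, qa_pairs, question):
    """Assemble a text-only prompt from image captions and exemplar QA pairs.

    HYPOTHETICAL template: the exact wording used by Img2Prompt is not given
    in the abstract, so this layout is an assumption.
    """
    lines = ["Contexts: " + " ".join(captions)]
    for exemplar_question, exemplar_answer in qa_pairs:
        lines.append(f"Question: {exemplar_question} Answer: {exemplar_answer}")
    # The final question is left open for the LLM to complete.
    lines.append(f"Question: {question} Answer:")
    return "\n".join(lines)

if __name__ == "__main__":
    prompt = build_vqa_prompt(
        captions=["A dog runs on a beach.", "Waves are visible behind the dog."],
        qa_pairs=[("What animal is shown?", "dog"), ("Where is the dog?", "beach")],
        question="What is behind the dog?",
    )
    print(prompt)
```

Because the prompt is ordinary text, the same string could be passed to any text-only LLM, which is the flexibility the abstract attributes to a plug-and-play module.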
